Abstract

We consider the problem of getting a computer to follow reasoning conducted in dynamic 
logic. This is a recently developed logic of programs that subsumes most existing first-order 
logics of programs that manipulate their environment, including Floyd's and Hoare's logics of 
partial correctness and Manna and Waldinger's logic of total correctness. Dynamic logic is more 
closely related to classical first-order logic than any other proposed logic of programs. This 
simplifies the design of a proof-checker for dynamic logic. Work in progress on the
implementation of such a program is reported, and an example machine-checked proof is
exhibited.
S.D. Litvintchouk and V.R. Pratt
Introduction

The logical language. 

Our objective is to be able to discuss programs with a computer. The prerequisites are a 
language for holding the conversation in, and reliable criteria for following a line of reasoning 
expressed in this language. We adopt a simple language having just four basic constructs. 
Three of these constructs come from ordinary logic; they are function symbols, predicate 
symbols, and logical connectives. (We lump constants and variables together with the zeroary 
function symbols.) The fourth construct, while not a familiar one in logic, is nevertheless one 
that occurs in everyday conversations about programs; it is the notion of "after executing 
program α." For example we may say in ordinary conversation, "After executing the program
X:=1, X is equal to 1."

While these four constructs may not seem very much to go on, they are in fact sufficient 
for almost any "first-order" conversation about the input-output behavior of programs. They 
may express such diverse concepts as partial correctness, termination, equivalence, determinism 
versus nondeterminism, totality, reversibility of a process, accessibility of states, weakest
antecedents, strongest consequents, weakest and strongest invariants, and convergents. They 
also shed new light on the axioms relevant to quantifiers in first-order predicate calculus by 
treating them from the programmer's point of view rather than the logician's. 

We abbreviate "after executing program α" to [α], so that the observation of the first
paragraph condenses to [X:=1]X=1. (We have found it convenient in conversation to pronounce
[α] as "box α.") We shall later find useful the dual concept ¬[α]¬, which we write <α>,
pronounced "diamond α." The notation is borrowed from modal logic. Dynamic logic is more
intimately connected with modal logic than one might at first suppose; the connection is
discussed in more detail in section 3.2 of [21]. Fischer and Ladner [6] demonstrate the
connection between various restrictions of dynamic logic and the classical systems K, T, S4 and
S5 of modal logic. We call [α] and <α> modalities (respectively box and diamond modalities),
and formulae of the form [α]P and <α>P modal formulae. We shall call a quantifier-free logic
augmented with such modalities a dynamic logic. Syntactically, modalities behave exactly like
logical negation; they are placed in front of a formula, and their precedence is such that
[α]P∧Q is parsed ([α]P)∧Q, just as ¬P∧Q would have been parsed (¬P)∧Q.

The programs

In order to understand the meaning of a formula such as ¬[X:=1]false, we first need a precise
account of X:=1. We shall think of programs solely in terms of their effect on the state of the
world. A state is defined by the values taken on by the function and predicate symbols of the 
language in some domain. (A logician would call a state an interpretation.) We call the set of 
all possible states (keeping the domain fixed) the universe. Thus a universe is defined by the 
available function and predicate symbols and the choice of domain. 

We could restrict our attention to deterministic programs, permitting us to think of them as 
functions from states to states. As we shall see later however, reasoning nondeterministically 
about deterministic programs can simplify the argument. Hence we shall allow for 
nondeterministic programs by capturing the effect of a program on a universe as a binary relation 
on that universe. This of course means that we will be able to discuss nondeterministic 
programs in general. However, the question of what first-order facts one wants to assert
about nondeterministic programs is presently the subject of much discussion in the literature (see
[5] in particular), and we shall avoid that issue in this paper beyond observing that dynamic logic
as presented here can express many useful ideas about nondeterministic programs.

In treating programs as binary relations we shall make use of the usual notation that IαJ is
true just when the states I and J are related by α. It is convenient to identify the relation α
with its graph, the set of pairs of states related by α.

Programming constructs

The programs we want to discuss have five constructs. These constructs, while not all
entirely conventional, have been chosen primarily for the ease with which one can discuss
programs written using them.

(i) Assignments. X:=1 is an instance of an assignment, as is C(I,K):=C(I,K)+A(I,J)×B(J,K). In
general an assignment is a pair of terms (respectively the left-hand and right-hand sides of the
assignment) of our logical language. (A term is an expression constructed solely from function
symbols.) We shall take zeroary function symbols to be ordinary variables. Then when the
left-hand term is simply a zeroary function symbol the assignment is simple variable assignment;
for other left-hand sides we have array assignments. Formally, the simple variable assignment
X:=t, where t is an arbitrary term, is the set of pairs of states {(I,J) | X_J = t_I and
otherwise I=J}. (X_J denotes the value of X in state J, t_I the value of t in state I.) Array
assignments are slightly harder to define; see [21].

(ii) Tests. Conditionals are usually introduced with "if-then-else." However the rules of
reasoning (our axiom system) can be simplified by using a "smaller" notion of conditional, the
test, which we shall use in conjunction with the next two constructs to synthesize if-then-else.
X>0? is an instance of a test, as is J=0∨Pattern(J)=Text(K)?. In general a test P? is constructed
from a formula P of the logical language. The idea of a test is that a computation may proceed
past a test just when that test evaluates to true in the current environment; otherwise the
computation must block (which for our purposes is equivalent to going into an infinite loop).
Formally, the test P? is the restriction of the identity binary relation on the universe to those
states satisfying P, i.e. {(I,I) | I⊨P}. Most of what we say holds even for tests containing
modalities, corresponding to the side-effect-free programming construct "if P would be the
result of running α then..." However we shall confine our examples to the more familiar
modality-free variety.

(iii) Alternation. Execution of α|β means the execution of either one (but not both) of α and
β, the choice being made nondeterministically. Formally, the relation α|β is the union of the
relations α and β.

(iv) Sequencing. Execution of α;β means execution of first α then β. Formally, α;β is the
composition of α and β.

Using tests, alternation, and sequencing, we may express "if P then α else β," where P is a
formula and α and β are programs, as P?;α|¬P?;β. In effect, P and ¬P act as "guards," to use
Dijkstra's [5] terminology. P?;α can only be executed when P holds, and conversely for ¬P?;β.
Hence when P is true P?;α|¬P?;β must be equivalent to α, and otherwise to β, which is the
property "if P then α else β" should have.

(v) Iteration. Execution of α* means executing α zero or more times, the number of times
being chosen nondeterministically. Formally, α* is the reflexive transitive closure of α.

Using tests, sequencing, and iteration, we may express "while P do α" in much the same
spirit as if-then-else, namely as (P?;α)*;¬P?. This permits α to be iterated for as long as P
remains true. Moreover, the iteration may not terminate while P remains true, on account of
the ¬P? guarding the exit. (This usage of a guard at the end is not permitted in [5].)

With these five constructs we can express any flowchart program that has decision boxes 
and manipulates arrays. This can be done without introducing additional variables. This 
follows from the fact that state transition diagrams can always be translated into equivalent 
regular expressions. This is not possible with assignments, sequencing, if-then-else and while-
do [2], the difference having to do with the determinism of the latter.

Quasi-programming constructs

In addition to the five constructs for our programming language, we shall find two more 
constructs of interest, not in writing programs but in talking about them. 


(vi) Random assignment. X:=? is an instance of a random assignment, which
nondeterministically assigns an arbitrary element of the domain to X. Consider the sense of
[X:=?](X<0∨X≥0). This says that no matter what element is assigned to X, after the assignment
X will be either negative or non-negative. This captures what is meant by ∀X(X<0∨X≥0).
This demonstrates that we can introduce the ordinary quantifier ∀X into dynamic logic as just
another modality [X:=?]. Though we shall adhere to the standard notations ∀X and ∃X it should
be understood that these stand respectively for [X:=?] and <X:=?>.


(vii) Converse. Execution of α⁻ means the reverse execution of α. Formally, α⁻ is the
converse of α, satisfying Iα⁻J ≡ JαI. This permits us to reason either forwards or backwards
about a program's behavior. We mean this in the sense that (i) [α]P is a claim made before
execution of α based on the claim P that is supposed to hold after execution of α (backward
reasoning), and (ii) [α⁻]P is a claim made after the execution of α based on the claim P that is
supposed to hold before execution of α (forward reasoning). We have not capitalized on
converse as much as we would like in our work to date.

Truth-value semantics of dynamic logic


Now that we have settled on the programming language, we can return to the question of
what ¬[X:=1]false means, or more generally what any formula containing [α]P means. It is
important here to realize the distinction between truth and validity. What we are about to
define is the truth value of a formula of dynamic logic in a single state. This is to be
contrasted with, say, Hoare's notion of "P{α}Q," whose truth is not defined on a state-by-state
basis but rather is defined for the whole universe, and so corresponds to the usual notion in
logic of validity.

In state I, [α]P is true just when P is true in every state J satisfying IαJ. That is, P is
true no matter which state α terminates in when started in state I. It follows that <α>P is
true just when P is true in some state J satisfying IαJ, that is, when it is possible for α to
terminate and satisfy P if started in state I.
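The two truth definitions translate almost word for word into code once a program is given as a set of state pairs. This is a minimal sketch, assuming integer states and three toy relations invented purely for illustration; box and diamond are hypothetical helper names.

```python
def box(rel, pred, state):
    """[a]P holds in `state` iff P holds in every state reachable via a."""
    return all(pred(t) for (s, t) in rel if s == state)

def diamond(rel, pred, state):
    """<a>P holds in `state` iff P holds in some state reachable via a."""
    return any(pred(t) for (s, t) in rel if s == state)

# Toy programs over integer states (an illustrative assumption).
STATES = range(5)
set_to_one = {(s, 1) for s in STATES}            # X := 1
blocked = set()                                  # false? : never terminates
coin = {(s, t) for s in STATES for t in (0, 1)}  # X := 0 | X := 1

is_one = lambda s: s == 1
false = lambda s: False
```

With these definitions box(blocked, false, 3) holds vacuously, whereas box(set_to_one, false, 3) fails, which is the content of ¬[X:=1]false. The nondeterministic coin satisfies diamond(coin, is_one, 2) but not box(coin, is_one, 2).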

Expressive power of dynamic logic 

We may now show how dynamic logic may be used to express a variety of concepts. 

P{α}Q                     ⊨ P ⊃ [α]Q
Termination analogue      ⊨ P ⊃ <α>Q
  of P{α}Q
α=β                       ⊨ ∀X((<α>Y=X) ≡ (<β>Z=X))
    (This assumes that Y,Z are the respective output variables of α,β.
    While this generalizes to programs with any finite set of output
    variables, it does not generalize to programs with arrays as output
    unless we introduce second order quantifiers.)
α deterministic           ⊨ ∀X(<α>Y=X ⊃ [α]Y=X)
    (As for equivalence, Y is the output variable of α.)
α always halts            ⊨ <α>true
α 1-1                     ⊨ ∀X(<α⁻>Y=X ⊃ [α⁻]Y=X)
α onto                    ⊨ <α⁻>true
α halts in this state     <α>true
weakest antecedent        [α]P
strongest consequent      <α⁻>P
weakest invariant         [α*]P
strongest invariant       <α⁻*>P
convergent                <α^N>P    (see [9] for proof)

The reader wishing to pursue these concepts further is referred to [9]. Some simple
statements expressible in dynamic logic that do not fall into any of the above categories, and are
not expressible in Hoare's partial correctness formalism or the total correctness formalism of
Manna and Waldinger [17], are:

"If you set Y to X+5 and then add 2 to Y an indefinite number of times then it is
possible by repeatedly executing Y:=Y-1 to make Y=X"

    [Y:=X+5;Y:=Y+2*]<Y:=Y-1*>Y=X

"If P holds then after executing α we will be in a state accessible via α from
some state satisfying P."

    P ⊃ [α]<α⁻>P

All of the above concepts can be stated in a second order logic that permits explicit 
manipulation of states and/or programs as individuals, as in [3] where states can be quantified
over, or [10] where programs are terms. The interest in dynamic logic is that it achieves its
expressive power using only first-order language. The advantage of keeping the language 
restricted in this way is that it is easier to completely axiomatize parts of the logic, though loops 
present an insurmountable obstacle to completeness as demonstrated in Theorem 16 of [21].


An axiom system for dynamic logic 

Before axiomatizing the programming language, let us begin with a sound complete axiom 
system for first-order logic. A novelty of this system is that it separates into logical and non- 
logical components what are usually taken to be entirely logical rules and axioms, on the principle 
that facts about X:=? are program-specific facts. This permits a programmer to apply his 
intuition about programs to the problem of understanding the significance of each axiom. 

Logical Axioms 

All tautologies of Propositional Calculus.
[α](P⊃Q) ⊃ ([α]P ⊃ [α]Q).     (Axiom M)

Logical Inference Rules

P, P⊃Q ⊢ Q.     (Modus Ponens)

P ⊢ [α]P.       (Necessitation; subsumes generalization, P ⊢ ∀xP.)

Non-logical Axioms

∀XP ⊃ P[T/X]    (any term T; P[T/X] is P with T for X)

P ⊃ ∀XP         unless X occurs free in P

Axiom M can be thought of as a claim about programs; it says that for all states I, if P
implies Q in every state J that can be reached from I by executing α, and if P holds in every
state J similarly accessible from I via α, then Q holds in every state J accessible from I via α.

The second inference rule (the rule of necessitation of modal logic) can be considered as an 
upper bound on the power of programs, which cannot falsify theorems. If P is a theorem then P 
is true in every state, including states accessible via a. 

In our system it is straightforward to prove as theorems the axioms of, say, Mendelson's 
system K [18] (p. 57), and it should be clear that the second rule subsumes the rule of
generalization; in fact, if the only modalities allowed are those with values of the form X:=?
then the rule of necessitation is the rule of generalization, and the theorems of this system
coincide with the theorems of K. It is interesting to note that Mendelson manages to express as
one axiom what we take two to express, namely our Axiom M and the second quantification 
axiom. The advantage of our decomposition of this axiom is that we get two axioms about 
quantifiers that serve respectively as a lower and an upper bound on what the binary relation 
X:=? may be. 

So far we have only given axioms for random assignments. Now let us axiomatize the four 
loop-free programming constructs. 

[P?]Q ≡ P⊃Q
[X:=t]P ≡ P[t/X]    (P[t/X] is P with t for X; see [21] for array assignment)
[α|β]P ≡ [α]P∧[β]P
[α;β]P ≡ [α][β]P

Interestingly (but fairly obviously, as demonstrated in [21]), the axiom system with these
four new axioms remains sound, complete and effective. (It is possible to give further axioms to 
handle the converse operation, still preserving soundness, completeness and effectiveness. 
However we shall not make use of this in the following.) 
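Read from left to right, the four axioms act as rewrite rules that remove every box over a loop-free program. The sketch below illustrates that reading and is not the authors' LISP code: formulae and programs are encoded as nested tuples of our own devising, variables are strings, constants are integers, and expand, subst and subst_term are hypothetical names.

```python
def subst_term(term, var, repl):
    """Replace variable `var` by `repl` inside a term."""
    if term == var:
        return repl
    if isinstance(term, tuple):            # (operator, arg1, arg2, ...)
        return (term[0],) + tuple(subst_term(a, var, repl) for a in term[1:])
    return term                            # another variable or a constant

def subst(formula, var, repl):
    """P[repl/var] for a modality-free formula."""
    tag = formula[0]
    if tag == "atom":                      # ("atom", relation, term, term)
        return ("atom", formula[1]) + tuple(
            subst_term(t, var, repl) for t in formula[2:])
    return (tag,) + tuple(subst(f, var, repl) for f in formula[1:])

def expand(formula):
    """Eliminate box modalities over loop-free programs."""
    tag = formula[0]
    if tag == "atom":
        return formula
    if tag != "box":                       # propositional connective
        return (tag,) + tuple(expand(f) for f in formula[1:])
    prog, post = formula[1], formula[2]
    kind = prog[0]
    if kind == "test":                     # [P?]Q    becomes  P implies Q
        return ("imp", expand(prog[1]), expand(post))
    if kind == "assign":                   # [X:=t]P  becomes  P[t/X]
        return subst(expand(post), prog[1], prog[2])
    if kind == "alt":                      # [a|b]P   becomes  [a]P and [b]P
        return ("and", expand(("box", prog[1], post)),
                expand(("box", prog[2], post)))
    if kind == "seq":                      # [a;b]P   becomes  [a][b]P
        return expand(("box", prog[1], ("box", prog[2], post)))
    raise ValueError(f"unsupported program: {kind}")

x_positive = ("atom", ">", "X", 0)
either = ("alt", ("assign", "X", 1), ("assign", "X", 2))
print(expand(("box", either, x_positive)))
```

The printed value encodes 1>0 ∧ 2>0, the expansion of [X:=1|X:=2]X>0.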

A derived rule 

We could at this point proceed with the discussion of our ultimate objective, the 
construction of a proof-checking program that would check proofs expressed in the above axiom 
system. Unfortunately the above system is too weak to permit reasonably succinct proofs; for
example, it appears that 6 lines are needed to prove <X:=1>X=1 from the assumption 1=1 using the
above system. In this section we explore a derived rule with an eye on strengthening the axioms
and rules. In this respect we are emulating J.A. Robinson [22], who prescribed a new rule to
facilitate the construction of automatic theorem-provers. The constraints on a proof-checker
are somewhat different from those of a theorem-prover, and the arguments for Robinson's
resolution rule are not sufficiently compelling for us. In particular, the convenience of having a
clause as the unit of information, which helps an automatic theorem prover organize the proof,
may be more hindrance than help in a proof-checker because the user may not have conceived his
proof in terms of clauses that are disjunctions of literals. This is not to say that we shall not
make use of unification; indeed, unification is a most valuable tool in automated logic.

We now give the details of the rule, which we call the Show Rule for lack of a more
descriptive term. A proof step using it looks like

Show S {ps} using T0 {p0}, T1 {p1}, T2 {p2}, ...

For the moment ignore the items inside braces { }. Ideally, we would like this rule to apply
whenever the formulae T0, T1, T2, ... logically entail the formula S, a semantic characterization
of the rule. Unfortunately that would lead to a non-effective proof-checker, since logical
entailment is not even partially decidable for our language [9]. Instead we resort to an effective
syntactic characterization. This is where the items in braces enter the picture. The braces
enclose "templates" which contain the propositional content of the proof step, in the sense that
each template is a propositional "approximation" to the formula it follows. For example, we might
say


Show [X:=1|X:=2]X>0 {p∧q} using 1>0 {p}, 2>0 {q}.

The template p∧q refers to the result of expanding [X:=1|X:=2]X>0 first to
[X:=1]X>0 ∧ [X:=2]X>0 and then to 1>0 ∧ 2>0. It should be clear that the two uses of p in the
templates refer to the same formula, 1>0, and similarly for the two uses of q. More generally, we
shall require only that multiple occurrences of the same letter refer to unifiable formulae.

We check this proof step in two phases, which can be done independently and in either order
(or in parallel by two processors). One phase, called IDENTIFY, is to check that repetitions of
the same letter can be justified. We do this by attempting to unify corresponding formulae. The
other phase, called VERIFY, is to see whether the templates alone constitute a sound argument in
modal propositional logic. In this example all modalities were eliminated so that we were left
with the argument

Show p∧q using p, q

which is in fact a sound argument of non-modal propositional logic. A situation where modal
logic would help is:

Show [U;V]Y>0 {[α][β]p}
using [U]X=1 {[α]q}, X=1⊃[V]Y>0 {q⊃[β]p}.

Here we are dealing with "uninterpreted" programs U and V, a situation that arises when we are
given a program about which we have previously proved some useful properties and whose code
we no longer wish to be bothered with. (This situation arises frequently in the extended
example of the next section but one.) In this case, knowing nothing about the programs U and V
beyond the facts given, we could not expand them in the way we did with [X:=1], so they carry
over to the templates. Here the argument of modal logic is:

Show [α][β]p using [α]q, q⊃[β]p.

This argument can readily be seen to follow if we apply Necessitation to q⊃[β]p to get
[α](q⊃[β]p) and hence, by Axiom M, [α]q⊃[α][β]p. The rest is propositional reasoning.
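For templates from which every modality has been eliminated, soundness of the template argument is ordinary propositional entailment, which can be checked by brute force. The sketch below uses plain truth-table enumeration, which is our substitution for illustration and not the procedure used in the implemented system; modal templates such as the one just discussed are outside its scope. The tuple encoding and the names holds, letters and sound are assumptions.

```python
from itertools import product

def holds(f, env):
    """Evaluate a propositional template; a letter is a string."""
    if isinstance(f, str):
        return env[f]
    op = f[0]
    if op == "not":
        return not holds(f[1], env)
    if op == "and":
        return holds(f[1], env) and holds(f[2], env)
    if op == "or":
        return holds(f[1], env) or holds(f[2], env)
    if op == "imp":
        return (not holds(f[1], env)) or holds(f[2], env)
    if op == "iff":
        return holds(f[1], env) == holds(f[2], env)
    raise ValueError(op)

def letters(f):
    """Collect the propositional letters occurring in a template."""
    if isinstance(f, str):
        return {f}
    return set().union(*(letters(g) for g in f[1:]))

def sound(goal, premises):
    """True iff every assignment satisfying all premises satisfies goal."""
    names = sorted(letters(goal).union(*(letters(p) for p in premises)))
    for values in product([False, True], repeat=len(names)):
        env = dict(zip(names, values))
        if all(holds(p, env) for p in premises) and not holds(goal, env):
            return False
    return True

print(sound(("and", "p", "q"), ["p", "q"]))   # Show p∧q using p, q
```

Dropping the premise q makes the same goal fail the check, since the assignment with p true and q false is then a counterexample.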

The IDENTIFY phase begins by determining what subformula each occurrence of a template 
letter refers to. This is done by systematically expanding the formula associated with the 
template containing the given letter until the formula can be matched to the template. Thus 
[U;V][W]X=0 will match [α][β]p directly with α matched to U;V, β to W and p to X=0. However
[U;V]X=0 will not match [α][β]p directly but must first be expanded as [U][V]X=0. [V|W]X=0
will not match p∧q directly but must first be expanded as [V]X=0 ∧ [W]X=0. Once the formula
matches the template, the subformula corresponding to each letter can immediately be
determined. Then all the subformulae corresponding to occurrences of the same letter are
checked for whether they can be unified. This may require further expansion; for example,
attempting to unify [X:=1]X>0 and W>0 involves eliminating the assignment modality to give 1>0,
and instantiating W as 1, this latter step being performed by a unification algorithm. All 
instantiations necessary must be compatible with each other. 
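The match-then-expand behaviour described above can be sketched as follows. This is a simplification under stated assumptions: only the sequencing and alternation expansions are implemented, repeated letters must be bound to syntactically identical subformulae (the paper requires only that they be unifiable), and the tuple encoding and the names match and expand_once are hypothetical.

```python
def expand_once(formula):
    """One expansion step at the top of a box formula, or None."""
    if formula[0] == "box":
        prog, body = formula[1], formula[2]
        if prog[0] == "seq":               # [a;b]P -> [a][b]P
            return ("box", prog[1], ("box", prog[2], body))
        if prog[0] == "alt":               # [a|b]P -> [a]P and [b]P
            return ("and", ("box", prog[1], body), ("box", prog[2], body))
    return None

def match(template, formula, bindings=None):
    """Return letter bindings that make template fit formula, else None."""
    bindings = dict(bindings or {})
    if isinstance(template, str):          # a formula letter
        if bindings.setdefault(template, formula) != formula:
            return None
        return bindings
    if template[0] == formula[0] == "box":
        direct = dict(bindings)
        if direct.setdefault(template[1], formula[1]) == formula[1]:
            result = match(template[2], formula[2], direct)
            if result is not None:
                return result
    elif template[0] == formula[0] == "and":
        left = match(template[1], formula[1], bindings)
        if left is not None:
            right = match(template[2], formula[2], left)
            if right is not None:
                return right
    expanded = expand_once(formula)        # no direct match: expand and retry
    return match(template, expanded, bindings) if expanded is not None else None

U, V, W = ("prog", "U"), ("prog", "V"), ("prog", "W")
x_zero = ("atom", "X=0")
box_box_p = ("box", "a", ("box", "b", "p"))   # the template [a][b]p

direct = match(box_box_p, ("box", ("seq", U, V), ("box", W, x_zero)))
after_expansion = match(box_box_p, ("box", ("seq", U, V), x_zero))
conjunction = match(("and", "p", "q"), ("box", ("alt", V, W), x_zero))
```

In the first call the letter a is bound to U;V as a whole; in the second the formula must be expanded to [U][V]X=0 before it fits, so a is bound to U and b to V.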

Any formulae that fail to unify are put to one side while the remainder of the proof step is 
checked. When that is done, then the failed pairs are expressed as an equivalence and tested by 
a routine that checks for validity of quantifier-free Presburger arithmetic, in the hope that the 
formulae turn out to be equivalent on arithmetic grounds. (This together with the Rule of 
Convergence described in the next section is the only concession to domain-dependencies in the 
system.) 




The VERIFY phase is a satisfiability tester for modal propositional logic. It begins by
determining what applications of the Rule of Necessitation are sufficient to make the proof go
through. Boxes are then eliminated from the formula by the appropriate generalization of the
transformation <α>P∧[α]Q → <α>(P∧Q), which preserves satisfiability for the intuitively
obvious reason that [α]Q acts only as a constraint on those worlds one might construct (in
attempting to satisfy <α>P) that are accessible via α and satisfy P, namely that in any such world
Q must be true. In our present implementation, we first eliminate all top-level propositional
letters by expressing the formula in conjunctive normal form and applying the Davis-Putnam
algorithm for each of those letters. Then we convert the resulting formula involving only
modalities to disjunctive normal form and apply the above transformation. Then the process is
repeated on the arguments of the top-level diamond modalities. Though this approach can be
inefficient, in practice on the kinds of formulae we encounter it is the most efficient of the
methods we have tried. With all boxes eliminated, the satisfiability of the result no longer
depends on the names of the diamonds; that is, <α>P∨<β>Q and <α>P∨<α>Q are equally
satisfiable. Indeed, satisfiability of the whole is preserved if <α>P is replaced by true when P is
satisfiable and false when not. Thus we can proceed recursively, working up from the lowest
diamonds to determine satisfiability of progressively larger portions of the formulae.

Axioms for programs with loops 

For programs with loops we have the following axioms and rules. 

<αⁿ>P ⊃ <α*>P                   Axioms of Intent (one for each n).

P⊃[α]P ⊢ P⊃[α*]P                Rule of Invariance.

n>z ∧ P ∧ Q(n) ⊃ <α>∃w(z≤w<n ∧ Q(w))
⊢ n≥z ∧ Q(n) ⊃ <α*>((¬P ∧ ∃w(z≤w≤n ∧ Q(w))) ∨ Q(z))
                                Rule of Convergence.

These axioms and rules are explained and justified in more detail in [21]. The first says that
if a can halt in some number of steps then a* can halt. The second says that if a leaves P 
unchanged, then so does a*. (Observe how convenient it is to reason about iteration expressed 
in this form.) The third says that if a "drives" towards z without passing it, provided P remains 
true, then eventually a* will either make P false somewhere on the way to z, or it will reach z. 
Example proof

The following program was devised by Manna and Pnueli [16] to illustrate the efficacy of 
their method of proving termination. 

Y := 0;
while X ≠ 0 do
    (while X ≠ 0 do (X := X-1; Y := Y+1);
     Y := Y-1;
     while Y ≠ 0 do (Y := Y-1; X := X+1))

This program represents an obscure way of setting both X and Y to 0, namely by first 
setting Y to 0, then copying X into Y by repeatedly decrementing X and incrementing Y, then 
decrementing Y once, then copying Y back into X by repeatedly decrementing Y and incrementing 
X. This process is repeated until X becomes 0. The point of this example was that it was 
supposed to be difficult to prove termination of this program by Floyd’s method but easy by the 
method described by Manna and Pnueli. Our own interest in this program besides the question of 
ease of proving termination (not a problem in dynamic logic) is that it is just the right size to 
illustrate the proof techniques appropriate to dynamic logic. 
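The behaviour described above is easy to observe by running the program. The following is a direct transcription into Python, offered as an informal illustration rather than as part of the proof; it assumes a non-negative integer input (a negative X would never reach 0), and the function name and the step counter are our additions.

```python
def manna_pnueli(x):
    """Run the program from X = x (non-negative); return (X, Y, steps)."""
    y = 0
    steps = 0                              # inner-loop iterations performed
    while x != 0:
        while x != 0:                      # copy X into Y
            x, y = x - 1, y + 1
            steps += 1
        y -= 1                             # net effect of one pass: X drops by 1
        while y != 0:                      # copy Y back into X
            y, x = y - 1, x + 1
            steps += 1
    return x, y, steps

for start in range(5):
    print(start, manna_pnueli(start))
```

Each pass of the outer loop leaves X one smaller than it found it and leaves Y at 0, so the outer loop runs exactly as many times as the initial value of X.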

We may write this program in the programming language dynamic logic caters for thus. 

P1: X>0?; X:=X-1; Y:=Y+1.
P2: Y>0?; Y:=Y-1; X:=X+1.
P3: X>0?; P1*; X=0?; Y:=Y-1; P2*; Y=0?.
P4: Y:=0; P3*; X=0?.

P1 represents one step of "copying" a number from X to Y, while P2 represents one step of
copying from Y to X. P1*;X=0? and P2*;Y=0? each represent the entire copying process, from X
to Y and back again. P3 amounts to a program that, provided Y is initially 0, decrements X. P4 is
then the whole program for setting X and Y to 0. The statement we want to prove is <P4>t (t
denotes true), which asserts that it is possible for P4 to halt. The following is the proof, which
uses 7 hypotheses from arithmetic and 13 theorems. This proof is machine-readable.

% The Manna-Pnueli program %
P1: X>0?; X:=X-1; Y:=Y+1.
P2: Y>0?; Y:=Y-1; X:=X+1.
P3: X>0?; P1*; X=0?; Y:=Y-1; P2*; Y=0?.
P4: Y:=0; P3*; X=0?.

% Formulae occurring commonly in the proof %
Ax(n): X=n∧Y=0.        Bx(m,n): X=n∧X+Y=m.
Ay(n): Y=n∧X=0.        By(m,n): Y=n∧X+Y=m.

% Assumptions from arithmetic - not proved here %
H1: Z=n+1 ≡ Z-1=n.     H2: U=n+1 ⊃ U>0.
H3: Ax(n) ≡ Bx(n,n).   H4: Ay(n) ≡ By(n,n).
H5: Ax(n) ≡ By(n,0).   H6: Ay(n) ≡ Bx(n,0).
H7: U=U.

Thm1: Bx(m,n+1) ⊃ <P1>Bx(m,n).
  Show Thm1 {p∧r ⊃ <s?>(p∧r)} using H2 {p⊃s}.

Thm2: By(m,n+1) ⊃ <P2>By(m,n).
  Show Thm2 {p∧r ⊃ <s?>(p∧r)} using H2 {p⊃s}.

Thm3: Bx(m,n) ⊃ <P1*>Bx(m,0).
  Use Convergence(n) Thm3 from Thm1.

Thm4: By(m,n) ⊃ <P2*>By(m,0).
  Use Convergence(n) Thm4 from Thm2.

Thm5: Ax(n) ⊃ <P1*>Ay(n).
  Show Thm5 {p⊃<a>q}
    using Thm3 {p⊃<a>q}.

Thm6: Ax(n+1) ⊃ <P1*>Ay(n+1).
  Show Thm6 {p} using Thm3 {p}.

Thm7: Ay(n) ⊃ <P2*>Ax(n).
  Show Thm7 {r⊃<a>s}
    using Thm4 {p⊃<a>q}, H4 {r≡p}, H5 {s≡q}.

Thm8: Ay(n+1) ⊃ <Y:=Y-1>Ay(n).
  Show Thm8 {p∧r ⊃ q∧r} using H1 {p≡q}.

Thm9: Ax(n+1) ⊃ <P3>Ax(n).
  Show Thm9 {xn1∧y0 ⊃ <xp?;a;x0?;dy;b;y0?>(xn∧y0)}
    using H2 {xn1⊃xp},
          Thm6 {xn1∧y0 ⊃ <a>(yn1∧x0)},
          Thm7 {yn∧x0 ⊃ <b>(xn∧y0)},
          Thm8 {yn1∧x0 ⊃ <dy>(yn∧x0)}.

Thm10: Ax(n) ⊃ <P3*>Ax(0).
  Use Convergence(n) Thm10 from Thm9.

Thm11: X=r ⊃ <Y:=0>(X=r∧Y=0).
  Show Thm11 {p⊃p∧q} using H7 {q}.

Thm12: X=r ⊃ <P4>t.
  Show Thm12 {p⊃<a;c;r?>t}
    using Thm10 {p∧q⊃<c>(r∧q)}, Thm11 {p⊃<a>(p∧q)}.

Thm13: <P4>t.
  Show Thm13 {p} using Thm12 {q⊃p}, H7 {q}.

To avoid being distracted by extraneous issues such as arithmetic truth we have introduced 
all arithmetic facts in this proof as assumptions. (In fact, in the implemented system we have a 
very fast proof-checker for quantifier-free Presburger arithmetic, using quasi-Gaussian 
elimination.) 

The above proof is not the largest proof we have successfully checked with our system. A 
substantial part of a total correctness proof of the Knuth-Morris-Pratt pattern-matching 
algorithm has been machine-checked, and we are in the process of completing this proof. This 
extends work on the partial correctness of this algorithm by Wegbreit [24].

Discussion of the proof-checker 

We have constructed a system for checking proofs of the kind exemplified above. In this we 
are following in the footsteps of Milner [20,21,26], who is doing for Scott's Logic of Computable 
Functions what we are doing for the above modal extension to first-order logic. Inasmuch as we 
are treating programs that manipulate their environments, we are also continuing a tradition of 
several years of implementing systems for proving and checking proofs of properties of programs 
[4,8,13,14,23,24]. However the greater expressive power of dynamic logic compared to that of
partial correctness assertions (the language used in almost all such systems) adds considerably to
the interest of our system. This consideration actually makes Milner’s system a closer relative 
of ours than the partial correctness systems, due to the greater emphasis on "expressions as 
first class citizens" in Milner's system and ours, resulting in a logic where programs and facts 
mingle more freely than say with Hoare's notation. The major difference between Milner’s 
system and ours is the LCF treatment of programs (computable functions) as individuals in the 
underlying domain versus our treatment of programs as "adverbs," analogously to quantifiers. 
Another system related to ours is Richard Weyhrauch’s [1,2.5] FOL (First-Order Logic) proof-
checker. A detail in which our program differs from Milner's and Weyhrauch's (apart from the
obvious one of choice of logical language) is that our program makes less of an effort to help the 
user interactively than is done by either LCF or FOL, but rather is, at least thus far, a system in 
which the user prepares his proof exactly as though he were writing a program. This means that 
his proof exists on a file and is read by the proof-checker just as an interpreter reads a program 
from a file. This has permitted us to focus all of our effort on the proof-checker proper. 

The proof-checker is implemented on the PDP-10 computer at M.I.T.’s Artificial Intelligence 
Laboratory. The program written to date has approximately 100 LISP functions comprising a total
of 1800 lines of code averaging 4 LISP atoms per line. The bulk of this code is for formula 
manipulation. However, a small amount of it is for book-keeping tasks of a relatively minor 
nature associated with keeping track of the structure of a proof. 

Directions for further research 


Although our immediate goals may not appear to be particularly ambitious or difficult to 
achieve, as well as not being obviously "Artificial Intelligence" research, we admit to far more 
ambitious and less plausible objectives on a larger time scale. Ultimately we see the proof- 
checker itself becoming a component of a variety of very intelligent program-manipulating 
programs. This depends on our belief that the ability to check proofs is a vital part of any 
program that pretends to "understand" some domain of discourse where the discussion is at all 
involved. Two applications that we would like to explore when the proof-checker has reached a 
satisfactory level of performance are (i) the automatic production of reliable software and (ii) 
machine-mediated reasoning about programs. Our plan of attack for each of these areas is not 
presently so crisp that we would feel confident embarking on either area forthwith, particularly 
the second, but we can nevertheless at this early stage present thoughts on these subjects. 

The notion of program reliability through correctness proofs has gained momentum in the 
past few years, spurred on most notably by the axiomatic methods of Floyd [7] and Hoare [11].
As yet there is not a shred of hard evidence to suggest that this approach supplies the most
economical approach to reliability (where the economics takes into account both the cost of
having unreliable software and the total programming and maintenance cost). Indeed, it may well 
turn out that the bulk of the problems encountered today with unreliable software may be 
disposed of by a happy combination of a good programming language and a clean programming 
style. Nevertheless, if the proof-oriented approach can be made to work and does not put too 
great a burden on the programmer and/or the computer, it may provide reliable software at low 
cost. We feel it is well worth continuing the experiments that have been going on in this area in 
the past few years. Although these experiments have not thus far demonstrated the value of 
correctness proofs, it is still too early to draw any negative conclusions about the method in 
general. 

From a longer-range viewpoint, the burden of programming should become progressively 
more the computer's responsibility, requiring the computer to "understand" better the programs 
it executes. This has been the trend since the first assembler was used, and though the trend is 
perhaps not as pronounced as some have hoped, there is no doubt that the trend continues. As it 
does, methods of reasoning about programs will concomitantly become a more essential part of 
the computer's repertoire. This raises the question of the choice of language most appropriate to 
such reasoning. In view of the expressive power of dynamic logic we feel that it is worth 
developing the methodology of reasoning in this language with an eye to automating the reasoning 
as far as possible. A program like our proof-checker is precisely what is needed in the way of a 
"black box" that "accepts" a reasonably sized step in a discussion about a program. The sort of 
machine-mediated discussions we envisage could quite well be cast as proofs, albeit in the form 
of a dialogue. If the notion of a dialogue as a proof seems strange, visualize a conversation 
about a program - punctuated with "I don't see why you need that test there" and "How do you
guarantee that X will never become negative?" Such conversations about programs arise all the 
time, and it is clear that the questions are referring to proofs, probably expressed informally but 
proofs nonetheless. One might argue that proof-checking is not understanding, but we would 
insist that it is at least a component of understanding. 

As humans are taken progressively further out of the loop (admittedly a very long-range 
view) the dialogue will become more of a monologue. However it may still be appropriate for the 
computer to reason about the programs it is contemplating using a language like dynamic logic. 
Thus even in this scenario the basic proof-checking methodology may continue to be used. We 
should add that we see nothing strange in the idea of a computer checking proofs that it 
generated itself; the best way to generate proofs may be to propose possibly faulty proofs and 
subject them to detailed criticism. This would require not only the error-detecting capability of 
our proposed proof-checker but error-correcting capabilities as well. 



David Harel and Albert Meyer made substantial contributions to the theoretical 
underpinnings of dynamic logic. We thank Derek Oppen and Richard Weyhrauch for many helpful 
ARPA-net-mediated conversations on theorem-proving and proof-checking. 